(ACM SRC Poster) PEAK: Parallel EM Algorithm using Kd-tree

Author

  • Laleh Aghababaie Beni
Abstract

The data mining community voted the Expectation Maximization (EM) algorithm one of the top ten algorithms with the most impact on data mining research [5]. EM is a popular iterative algorithm for learning mixture models, with applications ranging from computer vision and astronomy to signal processing. We present PEAK, a new high-performance parallel algorithm for multicore systems that addresses every stage of EM. We use tree data structures and user-controlled approximations to reduce the asymptotic runtime complexity of EM, yielding significant performance improvements. PEAK uses the same tree and algorithmic framework for all stages of EM. Experimental results show that our parallel algorithm significantly outperforms state-of-the-art algorithms and libraries on all dataset configurations (varying the number of points, the dimensionality of the dataset, and the number of mixtures). Looking forward, we identify approaches to extend this idea to a broader class of similar problems.
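For orientation, the baseline that the abstract sets out to speed up can be sketched as plain, exact EM for a one-dimensional Gaussian mixture. This is not PEAK: it uses no kd-tree, no approximation, and no parallelism, and the function names (`gaussian_pdf`, `em_gmm_1d`) and the synthetic data are assumptions made for illustration only. It shows why the cost matters: every E-step visits every point for every mixture component.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(points, means, variances, weights, iterations=50):
    """Plain EM for a 1-D Gaussian mixture (hypothetical helper, not PEAK).

    Each iteration costs O(n * k): the E-step touches all n points for all
    k components, which is the work tree-based methods try to reduce.
    """
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    n = len(points)
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, points)) / nj
            variances[j] = max(var, 1e-6)  # keep the variance strictly positive
            weights[j] = nj / n
    return means, variances, weights


# Synthetic data: two well-separated clusters (illustrative values only).
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(300)]
        + [random.gauss(5.0, 1.0) for _ in range(300)])
means, variances, weights = em_gmm_1d(data, [-1.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

A tree-based variant would, roughly speaking, replace the per-point loop in the E-step with a traversal over groups of nearby points, trading a user-chosen amount of accuracy for fewer density evaluations; the details of how PEAK does this are not given in the abstract.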

Similar resources

On Speeding up the Em Algorithm in Pattern Recognition: a Comparison of Incremental and Multiresolution Kd-tree-based Approaches

Finite mixture models implemented via the EM algorithm are being increasingly used in a wide range of problems in the context of unsupervised statistical pattern recognition. As each E-step visits each feature vector on a given iteration, the EM algorithm requires considerable computation time in its application to large data sets. We consider two approaches, an incremental EM (IEM) algorithm a...

Estimating Suspended Sediment by Artificial Neural Network (ANN), Decision Trees (DT) and Sediment Rating Curve (SRC) Models (Case study: Lorestan Province, Iran)

The aim of this study was to estimate suspended sediment using the ANN model, DT with the CART algorithm, and different types of SRC at ten stations in the Lorestan Province of Iran. The results showed that the accuracy of ANN with the Levenberg-Marquardt back propagation algorithm is higher than that of the other two models, especially at high discharges. Comparison of different intervals in models showed that r...

On some Variants of the EM Algorithm for the Fitting of Finite Mixture Models

Finite mixture models are being increasingly used in statistical inference and to provide a model-based approach to cluster analysis. Mixture models can be fitted to independent data in a straightforward manner via the expectation-maximization (EM) algorithm. In this paper, we look at ways of speeding up the fitting of normal mixture models by using variants of the EM, including the so-called s...

Blocking in Parallel Multisearch

External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Block-wise access to data is a central theme in the design of efficient EM algorithms. A similar requirement arises in the transmission of data between processors in certain parallel computation models such as BSP* and CGM, ...

Highly Parallel Fast KD-tree Construction for Interactive Ray Tracing of Dynamic Scenes

We present a highly parallel, linearly scalable technique of kd-tree construction for ray tracing of dynamic geometry. We use a conventional kd-tree compatible with high-performing algorithms such as MLRTA or frustum tracing. The proposed technique offers exceptional construction speed while maintaining reasonable kd-tree quality for the rendering stage. The algorithm builds a kd-tree from scratch each fra...

Publication date: 2015